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Abstract. - We propose a method to reconstruct and analyze a complex network from data 
generated by a spatio-temporal dynamical system, relying on the nonlinear mutual information 
of time series analysis and betweenness centrality of complex network theory. We show, that this 
approach reveals a rich internal structure in complex climate networks constructed from reanalysis 
and model surface air temperature data. Our novel method uncovers peculiar wave-like structures 
of high energy flow, that we relate to global surface ocean currents. This points to a major role of 
the oceanic surface circulation in coupling and stabilizing the global temperature field in the long 
term mean (140 years for the model run and 60 years for reanalysis data). We find that these 
results cannot be obtained using classical linear methods of multivariate data analysis, and have 
ensured their robustness by intensive significance testing. 



Introduction. — In the last decade, the complex net- 
work paradigm has proven to be a fruitful tool for the in- 
vestigation of complex systems in various areas of science, 
e.g., the internet and world wide web in computer science, 
food webs, gene expression and neural networks in biol- 
ogy, and citation networks in social science [l]. The intri- 
cate interplay between the structure and dynamics of real 
networks has received considerable attention 12]. Particu- 
larly, synchronization arising by the transfer of dynamical 
information in complex network topologies has been stud- 
ied intensively (3j. The application of complex network 
theory to climate science is a young field, where only few 
studies have been reported recently [4]-[8]. The vertices 
of a climate network are identified with the spatial grid 
points of an underlying global climate data set. Edges are 
added between pairs of vertices depending on the degree 
of statistical interdependence between the corresponding 
pairs of anomaly time series taken from the climate data 
set. Climate networks enable novel insights into the topol- 
ogy and dynamics of the climate system over many spatial 
scales ranging from local properties as the number of first 
neighbors of a vertex v (the degree centrality k v ) to global 
network measures such as the clustering coefficient or the 
average path length. The local degree centrality and re- 
lated measures have been used to identify supernodes (re- 
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gions of high degree centrality) and to associate them to 
known dynamical interrelations in the atmosphere, called 
teleconnection patterns, most notably the North Atlantic 
Oscillation (NAO) 4|. On the global scale, climate net- 
works were found to possess "small-world" properties due 
to long range connections (edges linking geographically 
very distant vertices), that stabilize the climate system 
and enhance the energy and information transfer within it 
HI . By studying the prevalence of long range connections 
in El Nino and La Nina climate networks [5: and the time 
dependence of the number of stable edges [6] , it has been 
shown very recently, that the El Nino-Southern Oscilla- 
tion (ENSO) has a strong impact on the stability of the 
climate system. 

Until now, researchers have used the linear cross- 
correlation function of pairs of anomaly time series to 
quantify the degree of statistical interdependence between 
different spatial regions. But the highly nonlinear pro- 
cesses at work in the climate system call for the appli- 
cation of nonlinear methods to obtain more reliable re- 
sults. Here we also use mutual information [9] to con- 
struct climate networks allowing to capture linear and 
nonlinear relationships between time series [8j. Further- 
more we use a measure of vertex centrality, betweenness 
(BC), that is defined locally but takes into account global 
topological information. Combining these two techniques, 
we uncover peculiar wave-like structures in the BC fields 
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of climate networks constructed from monthly averaged 
reanalysis and atmosphere-ocean coupled general circu- 
lation model (AOGCM) surface air temperature (SAT) 
data. Akin to the homonymous data highways of the in- 
ternet, these BC structures form the backbone of the SAT 
network, bundling most of the energy flow between remote 
regions. Some major features of the backbone appear to 
be closely related to surface ocean currents pointing to an 
essential role of the oceanic surface circulation in stabi- 
lizing the climate system by promoting the global flow of 
energy, mainly in the form of heat. Note that these in- 
sights are conceptually new and cannot be obtained using 
classical methods of climatology such as principal compo- 
nent analysis (PCA) or singular spectrum analysis (SSA) 
of anomaly fields [lOj, because these are by design local 
in a network sense and are not suitable to study local 
flow measures depending on a global network topology. 
We have performed intensive statistical tests with vari- 
ous types of surrogates to ensure the robustness of our 
results. The methodology developed in this letter has the 
potential to be universally applicable to extract the en- 
ergy, matter or information flow structure in any spatially 
extended dynamical system from observations taken from 
the real world, experiments and simulations. Our results 
are hence of interest for several branches of physics as well 
as various applications, e.g., fluid dynamics (turbulence), 
plasma physics, biological physics (population dynamics, 
neural networks, cell models). As demonstrated by its ap- 
plication to the climate system, our method is particularly 
relevant for the analysis of systems with highly hetero- 
geneous boundary conditions and forcing fields, that are 
found frequently in nature. 

Data. — We utilize the monthly averaged global SAT 
field to construct climate networks, that allows to directly 
capture the complex dynamics on the interface between 
ocean and atmosphere due to heat exchange and other 
local processes. SAT therefore enables us to study atmo- 
spheric as well as oceanic dynamics using the same climate 
network. We use reanalysis data provided by the National 
Center for Environmental Prediction/National Center for 
Atmospheric Research (NCEP/NCAR) [ll] and model 
output from the World Climate Research Programme's 
(WCRP's) Coupled Model Intercomparison Project phase 
3 (CMIP3) multi-model data set 12 . For optimal compa- 
rability with the reanalysis data, we choose a 20th century 
reference run (20c3m, as defined in the IPCC AR4) by the 
Hadley Centre HadCM3 model. A data set consists of a 
regular spatio-temporal grid with time series Xi{t) associ- 
ated to every spatial grid point i at latitude Xi and lon- 
gitude fa. Start and end dates, length of time series T, 
latitudinal resolution AA, longitudinal resolution Acf> and 
the number of vertices of the corresponding global climate 
network N are given in table [1] 

Methodology. — (i) To minimize the bias introduced 
by the external solar forcing common to all time series in 
the data set, we calculate anomaly time series ai(t) from 



Table 1: Properties of global surface air temperature data sets 





NCEP/NCAR 


HadCM3 


Period 


01/1948 - 12/2007 


01/1860 - 12/1999 


T [months] 


720 


1,680 


AA [°] 


2.5 


2.5 


Acf> [°] 


2.5 


3.75 


N 


10,224 


6,816 



the Xi(t), i.e., remove the mean annual cycle by phase 
averaging. Up to this point, we follow the method used 
previously by [5j[6]- It is known, that the annual cycle 
induces higher order effects such as seasonal variability of 
anomaly time series variance. We find that using only data 
from a particular season to avoid biases due to this effect 
does not alter our results substantially, so that we choose 
to use the whole data set for a more accurate evaluation of 
interdependence. Furthermore we normalize the anomaly 
time series to zero mean and unit variance. 

(ii) Mutual information (MI) is a measure from informa- 
tion theory, that can be interpreted as the excess amount 
of information generated by falsely assuming the two time 
series dj and aj to be independent, and is able to detect 
linear as well as nonlinear relationships 9|. MI can be 
estimated using 



(1) 



where Pi(fi) is the probability density function (PDF) of 
the time series a^, and pij(p,v) is the joint PDF of a 
pair (a^aj). By definition, My is symmetric, so that 
My = Mji. Note that in principle, one can evaluate a 
time delayed MI [9] . This is appropriate when studying cli- 
mate networks on smaller time scales using data sets with 
(sub-)diurnal resolution [6]. However, in the present work, 
we intend to study long term structural properties of the 
climate system on a scale of O(10 2 ) years using monthly 
averaged data. Most physical mechanisms of global infor- 
mation transfer in the SAT field such as travelling Rossby 
waves, heat exchange between ocean and atmosphere or 
the advection of heat by surface currents in the ocean act 
at time scales of less than one month. Therefore, it is rea- 
sonable to calculate MI at zero lag between anomaly time 
series. 

(iii) We now construct the climate network by thresh- 
olding the MI matrix My, i.e., only pairs of vertices 
that satisfy My > r are regarded as linked, where r is 
the threshold. Using the Heaviside function O(x), the 
adjacency matrix Ay of the climate network is given by 
Ay = O (My — t) — Sij , where <5y is the Kronecker delta. 
Note that Ay inherits its symmetry from My and the re- 
sulting climate network is an undirected and unweighted 
simple graph. One could construct a network with edges 
(i,j) weighted by My. At this stage, we keep our method 
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simple by studying an unweighted network. We suggest 
that weight information could help to identify the back- 
bone structure even more clearly, however, would not alter 
our conclusions below, because we use only a small number 
of edges with high MI that dominate the network. 

(iv) We find, that network characteristics, such as BC, 
clustering coefficient and average path length, are depen- 
dent on the choice of the threshold r. When comparing 
climate networks constructed from AOGCM and reanaly- 
sis data, it is consequently more meaningful to constrain 
the edge density p — 2E/{N{N — 1)), where E gives the 
total number of edges, than to fix t as it was done in all 
earlier works |4j|6j. The threshold r = r(p) is thus chosen 
to yield a prescribed edge density p. The PDF of MI P(M) 
over all pairs Mij is found to have a connected support, 
so that the edge density function p(r) — f dMP(M) is 
strictly monotonic decreasing with r and induces a one 
to one correspondence between r and p. Note that the 
backbone of the climate network is most clearly observed 
at small p with corresponding large threshold r, that is 
very unlikely to be exceeded by chance, as we reassured 
using significance tests based on randomly shuffled time 
series, Fourier surrogates [9] and twin surrogates 13 . We 
fix the edge density at p = 0.005, resulting in thresholds 
of Ti = 0.398 for the HadCM3 data and r 2 = 0.624 for the 
reanalysis data. The remaining 0.5% of all possible edges 
correspond to statistically significant and robust relation- 
ships (see below). In concordance with this observation, 
we find that small variations of p from the chosen value 
do not alter the backbone structure significantly. The re- 
maining edges are distributed heterogenously as they at- 
tach preferentially to pronounced supernodes, their range 
extending from local to global (teleconnections), which is 
consistent with earlier works (4|[8j- 

(v) Having constructed a climate network, we can fi- 
nally quantify the importance of a small part of the earth's 
surface (represented by a single vertex v) for the global 
flow of energy within the SAT field, that gives rise to 
the pairwise dynamical interdependencies measured by 
MI. Vertices play distinct roles in the energy transmis- 
sion throughout the network, some of them show a higher 
capability as compared to others. This capability can be 
quantified by the betweenness BC V . Assume that energy 
travels through the network on shortest paths. We then 
regard a vertex v to be important for the energy transport 
in the network, if it is traversed by a large number of all 
existing shortest paths 14 . The betweenness centrality is 
defined as 
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BC V = 
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Vv € (1, . . . , N), where <jy (v) gives the number of shortest 
paths from i to j, that include v. Here the contribution 
of shortest paths is weighted by their respective multi- 
plicity aij. Because the shortest paths considered contain 
only edges corresponding to pairs of highly dynamically 



interrelated time series, BC can be interpreted as a local 
measure of dynamical information flow. Since we use it 
to analyze a temperature field we nevertheless prefer to 
view BC more fundamentally as a measure of the flow 
of energy, mainly in the form of heat. BC is conceptu- 
ally distinct from other commonly used vertex centrality 
measures, e.g., degree and closeness centrality, and hence 
enables us to uncover interesting novel structural features 
of climate networks |8j. 

Results. — Following the method outlined above, 
we uncover peculiar wave-like structures of high BC in 
fields of both reanalysis and model SAT climate networks 
(fig. [I]). In analogy with the internet, we call the net- 
work of these channels of high energy flow the backbone 
of the climate network. We observe that prominent mainly 
meridional features of the backbone tend to approach the 
equator tangentially, as one would expect from modes of 
the atmospheric and oceanic general circulation due to the 
vanishing coriolis force at the equator [l7j. There is also 
a qualitative agreement on the location of major back- 
bone structures for both reanalysis (fig. 1 (a) I and model 
networks (fig. 1 1(b) ), e.g., the high BC channel over the At- 
lantic Ocean and the backbone structures over the east- 
ern Pacific Ocean, both connecting the Arctic with the 
Antarctic. 

The BC field of the MI climate networks considered 
here shows qualitative and quantitative deviations when 
compared to climate networks constructed using the lin- 
ear Pearson correlation (PC) (fig. [2]), while the backbone 
is clearly seen in both types of networks. The observed 
deviations could be partly due to a heterogeneous distri- 
bution of Shannon entropy among the at(t), introducing 
a bias in the calculation of My. Also it is well known 
that the nonlinear features of temperature dynamics might 
vary among reanalysis data sets, among climate models as 
well as between reanalysis and model data sets. Neverthe- 
less we argue in [8], that some of the deviations in the BC 
field may be attributable to nonlinear physical processes in 
the climate system. Particularly, we find that edges cor- 
responding to nonlinear statistical interrelationships are 
present in the MI climate network, which are excluded in 
the PC network at the same restrictive edge density level. 

Note that the strongest backbone structures lie mainly 
over the ocean and avoid to cross the land in both model 
and reanalysis climate networks. Therefore a physical 
mechanism involving an atmosphere-ocean coupling might 
contribute to the energy transport in the SAT field mea- 
sured by BC. Indeed, some of the strongest features found 



in t he NC EP/NCAR and HadCM3 BC fields (fig. [1(a) 
and 1(b) I resemble closely major surface ocean currents 
(fig. 1(c) j. For example, note the strong BC structures off 
the west coast of North and South America that resemble 
the Alaska and Peru current, and the backbone feature 
along the west coasts of Africa and Europe following the 
path of the Canary and Norwegian currents. These obser- 
vations can be understood considering the strong coupling 
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Figure 2: Normalized 



90°W 0° 90°E 

Betweenness (logio(BC+1)) 



.437 .874 1.311 1.748 2.185 2.622 3.059 3.496 3.933 4.37 4.807 5.244 5.681 6.118 




Figure 1: a) BC for the NCEP/NCAR reanalysis SAT MI net- 
work, and b) for the HadCM3 SAT network. Both networks are 
constructed at edge density p = 0.005 using MI. c) A schematic 
map of global surface ocean currents, after 16 . Note that 
some features of the backbone in a) and b) correspond closely 
to ocean surface currents shown in c), e.g., the Alaska, Peru 
and Canary currents. 



between sea surface temperature (SST) and SAT over the 
ocean via heat flux. Temperature anomalies in SST are ad- 
vected by the surface ocean currents and transfered to the 
SAT field via heat flux coupling. Therefore, ocean currents 
provide a physical mechanism for the transport of energy 



difference field ABC V = \BC% - 
BC™\/y/{BC£) w {BCtf) w of BC fields BCf and BC„ M , re- 
spectively calculated from PC and MI HadCM3 SAT climate 
networks at p — 0.005. 



together with dynamical information on localized linear 
structures over large distances. However, no clear traces 
of the strong western boundary currents (WBCs) such as 
the Gulf Stream or the Kuroshio are visible in the back- 



bone structure (fig. 1(b)). This might be due to the fact, 
that WBCs are much narrower than the eastern boundary 
currents discussed above [17], so that the effect of WBCs is 
not resolved by the grid underlying the HadCM3 climate 
network (see table [l). Note that using higher resolution 
reanalysis data (fig. 1(a)) and SAT data taken from the 
AOGCM ECHAM5 [12]7 we find that our method does 
indeed detect WBCs. Here it should be pointed out again 
that we are analyzing the SAT field, hence purely atmo- 
spheric effects, e.g., planetary waves, also contribute to the 
BC field and might explain some of its wave-like features, 
particularly over land. 

Backbone structures are not seen in fields of the com- 
plementary random walk betweenness [15] which measures 
diffusive flow in a network. This further supports our argu- 
ment that shortest path betweenness (BC) measures con- 
vective energy flow in a spatially extended network and is 
consistent with extremalization principles of physics, e.g., 
the Hamiltonian principle, interpreted within a graph the- 
oretical framework. 

To exclude the possibility that the observed backbone 
structures over the ocean might be simply due to local 
anomalies in the SST-SAT gradient caused by surface cur- 
rents, we have calculated the gradient field from the model 
run that we used to construct the HadCM3 climate net- 
work, and found that the SST-SAT gradient and BC are 
not correlated (fig. [3]). Furthermore, the backbone is nei- 
ther seen in fields of degree nor closeness centrality [8|[l4] , 
while BC statistically shows some correlation with these 
centrality measures (fig. [4]). Nevertheless there is a no- 
table tendency of high BC vertices to have a small degree, 
suggesting that they act as bottlenecks of energy flow in 
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90°W 0° 90°E 

SAT-SST gradient [K] 



-38.31 -34.92 -31.54 -28.15 -24.77 -21.38 



1 .23 -7.85 -4.46 



Figure 3: The mean SAT-SST gradient field (AT v (t)) t = 
(SAT v (t)) t -(SST v (t)) t calculated from the HadCM3 SAT and 
SST data sets [12] , both taken from the 20c3m run. 



the network. Therefore we conclude that the backbone 
structures observed in model and reanalysis networks are 
neither a trivial response to local anomalies in the SST- 
SAT gradient nor artifacts of chains of supernodes with 
high degree and closeness centrality. 

It is noteworthy that due to the given grids (table [T]), 
the vertex density on the earth's surface is maximum at 
the poles and decreases towards the equator. However, 
preliminary results using a density invariant generaliza- 
tion of BC suggest that our results are not appreciably 
impacted by this inhomogeneity. 

Significance testing. — To ensure the statistical ro- 
bustness of our results on the network level, we test the 
null hypothesis, that the climate network is a random 
graph with a given degree sequence. Using the configu- 
ration model and a link switching method [l], we gener- 
ate a Monte Carlo ensemble of 100 networks, that have 
approximately the same degree sequence as the recon- 
structed climate network. We find that in sharp contrast 
to the reconstructed network, the ensemble mean BC se- 
quence is highly correlated to the degree sequence, and 
does not display the backbone structures observed in the 
reconstructed climate network (fig. 5(a)). Based on this 



evidence we reject the null hypothesis that the climate 
network is random and conclude, that the backbone is un- 
likely to be a trivial consequence of the degree sequence. 

An alternative is to test the statistical robustness on 
the time series level and to develop the null hypothesis 
that the time series of the SAT data set are pairwise in- 
dependent. Specifically, we generate 100 twin surrogates 
from the original time series at each grid point. Compared 
to shuffled time scries or Fourier surrogates, twin surro- 
gates yield a higher test power since they approximately 
conserve all linear and nonlinear properties of the origi- 
nal single time series. We then construct an ensemble of 
100 networks from the resulting surrogate data sets, again 
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Figure 4: Scatter plots of betweenness BC V against (a) de- 
gree k v and (b) closeness (CC V , 
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) for the HadCM3 SAT 
MI climate network at p — 0.005. Specifically, the Spear- 
man's rank correlation coefficients of the centrality fields are 
r a (k v ,BC v ) = 0.25 and r s {CC v ,BC v ) = 0.30. 



fixing the edge density at p = 0.005, and compute the 
ensemble mean degree and betweenness sequences. While 
interestingly, the ensemble mean degree sequence closely 
resembles the degree sequence of the climate network, the 
ensemble mean BC sequence is again highly correlated to 
the ensemble mean degree sequence and contains no back- 
bone structures (fig. |5(b)[ ) . Based on these observations, 
we reject the null hypothesis that the time series of the 
SAT data set are pairwise independent and infer, that the 
backbone indeed characterizes the intrinsic complex topol- 
ogy of dynamical interrelationships. We have performed 
these statistical tests for both the reanalysis and model 
climate networks and came to the same conclusions. 

Conclusions and outlook. — In summary, using mu- 
tual information from nonlinear time series analysis and 
betweenness from complex network theory, we have un- 
covered novel pathways of global energy and dynamical 
information flow in the climate system, that we call the 
backbone of the climate network. Two conceptually in- 
dependent types of tests reveal that the backbone does 
not arise by chance and is not a trivial consequence of the 
degree centrality sequence studied in previous works on 
climate networks, but on the contrary represents a sta- 
tistically significant feature of the underlying SAT data 
set. Surface ocean currents appear to play a major role 
in the energy and information transfer and hence in the 
dynamical stabilization of the climate system in the long 
term mean (140 years for the HadCM3 model run and 60 
years for the reanalysis data). We observe similar back- 
bone structures in AOGCM model output and reanalysis 
data giving confidence that the backbone is not a model 
artifact. It is important to realize that our complex net- 
work approach is an essential ingredient in the discovery 
of the backbone. The main advantage of betweenness is 
that it takes into account the global network topology of 
pairwise interrelationships between regions. However, the 
classical linear methods (PCA, SSA, etc. 10 ) widely ap- 



plied to disclose teleconnection patterns in climatology use 
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90°W 0° 90°E 

BC Z-score (logio) 



1.6 2 2.4 2.8 3.2 3.6 



4.4 4.8 5.2 



Figure 5: Z-score field Z v = (BC V - m{BC%)) / <t{BC%) of the 
BC field with respect to (a) a configuration model ensemble 
and (b) an ensemble based on twin surrogate data sets with 
both n = fOO members calculated for the HadCM3 SAT MI 
climate network at p = 0.005 [l8]. m(BC r v ) and o(BC r v ) de- 
note the ensemble mean and standard deviation of BC at each 
vertex v, respectively. If \Z V \ 1, we can consider BC V to 
be significant with respect to the chosen network model. The 
backbone is recognizable with a large Z-score, indicating its 
statistical significance. 



information about next neighbors at each grid point, and 
are therefore only local within the complex network frame- 
work. Our method is promising to next study the impact 
of extreme events such as strong El Ninos, extreme Mon- 
soons or volcanic eruptions on the topology of climate net- 
works. In the future it will thereby allow us to obtain new 
insights into the individual local signature of changes in 
the energy and information flow structure and stability 
of the climate system. Our method may also be valu- 
able to illuminate differences in the backbone structure 
of different climate states in earth's history, e.g., during 
glacial and interglacial episodes, and to assess the impact 
of global warming on the stability of the climate system 
from a different perspective. More generally, we emphasize 
that the methodology introduced in this letter is universal 



in the sense, that it can be applied to study the struc- 
ture of energy, matter and information flow within any 
spatially extended dynamical system. 
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